Posted to user@flink.apache.org by Robert Metzger <rm...@apache.org> on 2016/04/04 10:32:19 UTC

Re: scaling a flink streaming application on a single node

Hi,

Usually it doesn't make sense to run multiple TaskManagers on a single
machine to get more slots.
Your machine has only 4 CPU cores, so you are just putting a lot of
pressure on the CPU scheduler.

On Thu, Mar 31, 2016 at 7:16 PM, Shinhyung Yang <sh...@gmail.com>
wrote:

> Thank you for replying!
>
> I am trying to do this on a single machine in fact. Since it has 64
> cores, it would be interesting to look at the performance in that
> regard.
>
> > How many machines are you using for this?
> >
> > The fact that you are giving 64 slots to each TaskManager means that a
> > single TaskManager may end up executing all 64 pipelines. That would
> > heavily overload that TaskManager and cause heavy degradation.
>
> Does it make sense if I run multiple TaskManagers on a single machine
> if 64 slots are too many for a TaskManager?
>
> > If, for example, you use 16 machines, then give each machine 4 task slots
> > (total of 64 slots across all machines)
> > That way, the final run (parallelism 64) will be guaranteed to be spread
> > across all machines.
>
> My intention for the experiment at the moment is to try to scale the
> application up on a single machine to its maximum before moving on to
> run the experiment on multiple machines.
>
> Thank you again!
> With best regards,
> Shinhyung Yang
>

Re: scaling a flink streaming application on a single node

Posted by Ufuk Celebi <uc...@apache.org>.
Just to clarify: Shinhyung is running on a single node with 4 CPUs,
each having 16 cores.

On Mon, Apr 4, 2016 at 10:32 AM, Robert Metzger <rm...@apache.org> wrote:
> Hi,
>
> Usually it doesn't make sense to run multiple TaskManagers on a single
> machine to get more slots.
> Your machine has only 4 CPU cores, so you are just putting a lot of pressure
> on the CPU scheduler.
>
> On Thu, Mar 31, 2016 at 7:16 PM, Shinhyung Yang <sh...@gmail.com>
> wrote:
>>
>> Thank you for replying!
>>
>> I am trying to do this on a single machine in fact. Since it has 64
>> cores, it would be interesting to look at the performance in that
>> regard.
>>
>> > How many machines are you using for this?
>> >
>> > The fact that you are giving 64 slots to each TaskManager means that a
>> > single TaskManager may end up executing all 64 pipelines. That would
>> > heavily overload that TaskManager and cause heavy degradation.
>>
>> Does it make sense if I run multiple TaskManagers on a single machine
>> if 64 slots are too many for a TaskManager?
>>
>> > If, for example, you use 16 machines, then give each machine 4 task
>> > slots (total of 64 slots across all machines)
>> > That way, the final run (parallelism 64) will be guaranteed to be spread
>> > across all machines.
>>
>> My intention for the experiment at the moment is to try to scale the
>> application up on a single machine to its maximum before moving on to
>> run the experiment on multiple machines.
>>
>> Thank you again!
>> With best regards,
>> Shinhyung Yang
>
>

Re: scaling a flink streaming application on a single node

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
I am not sure, since people normally don't run Flink on such large machines.
They tend to run it on many smaller machines instead.

It will definitely be interesting to see your new results where the job can
actually use all the memory available on the machine.

--
aljoscha

On Mon, 4 Apr 2016 at 15:54 Shinhyung Yang <sh...@gmail.com> wrote:

> Dear Aljoscha and Ufuk,
>
> Thank you for clarifying! Yes, I'm running this wordcount application
> on a 64-core machine with 120 GB of RAM allocated for users.
>
> > In that case, the amount of RAM you give to the TaskManager seems too low.
> > Could you try re-running your experiments with:
> > jobmanager.heap.mb: 5000
> > taskmanager.heap.mb: 100000
> >
> > So, about 100 GB of RAM for the TaskManager.
>
> Definitely I will try this! The result will be really interesting for
> sure. In this case, am I still good to go with 64 task slots with a
> single task manager?
>
> Thank you.
> With best regards,
> Shinhyung Yang.
>

Re: scaling a flink streaming application on a single node

Posted by Shinhyung Yang <sh...@gmail.com>.
Dear Aljoscha and Ufuk,

Thank you for clarifying! Yes, I'm running this wordcount application
on a 64-core machine with 120 GB of RAM allocated for users.

> In that case, the amount of RAM you give to the TaskManager seems too low.
> Could you try re-running your experiments with:
> jobmanager.heap.mb: 5000
> taskmanager.heap.mb: 100000
>
> So, about 100 GB of RAM for the TaskManager.

Definitely I will try this! The result will be really interesting for
sure. In this case, am I still good to go with 64 task slots with a
single task manager?

Thank you.
With best regards,
Shinhyung Yang.

Re: scaling a flink streaming application on a single node

Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
I'm afraid no one read your email carefully. You indeed have one very big
machine with 64 physical CPU cores and 120 GB of RAM, correct?

In that case, the amount of RAM you give to the TaskManager seems too low.
Could you try re-running your experiments with:
jobmanager.heap.mb: 5000
taskmanager.heap.mb: 100000

So, about 100 GB of RAM for the TaskManager.

Cheers,
Aljoscha
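
For reference, the settings suggested above could be written into
flink-conf.yaml roughly as follows. This is only a sketch: the two heap
values are the ones from this message, while the slot count of 64 is an
assumption taken from the single-TaskManager setup discussed earlier in the
thread, not a recommendation made here. The key names are those of Flink
releases from around the time of this thread; later releases renamed the
memory options, so check the configuration documentation for your version.

```yaml
# flink-conf.yaml (sketch, not a tuned configuration)

# Heap sizes in megabytes, as suggested in the message above.
jobmanager.heap.mb: 5000        # about 5 GB for the JobManager
taskmanager.heap.mb: 100000     # about 100 GB for the single TaskManager

# Assumption: one TaskManager offering one slot per core on the 64-core box.
taskmanager.numberOfTaskSlots: 64
```

The heap sizes are read when the processes start, so the JobManager and
TaskManager need to be restarted after changing them.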

On Mon, 4 Apr 2016 at 10:32 Robert Metzger <rm...@apache.org> wrote:

> Hi,
>
> Usually it doesn't make sense to run multiple TaskManagers on a single
> machine to get more slots.
> Your machine has only 4 CPU cores, so you are just putting a lot of
> pressure on the CPU scheduler.
>
> On Thu, Mar 31, 2016 at 7:16 PM, Shinhyung Yang <sh...@gmail.com>
> wrote:
>
>> Thank you for replying!
>>
>> I am trying to do this on a single machine in fact. Since it has 64
>> cores, it would be interesting to look at the performance in that
>> regard.
>>
>> > How many machines are you using for this?
>> >
>> > The fact that you are giving 64 slots to each TaskManager means that a
>> > single TaskManager may end up executing all 64 pipelines. That would
>> > heavily overload that TaskManager and cause heavy degradation.
>>
>> Does it make sense if I run multiple TaskManagers on a single machine
>> if 64 slots are too many for a TaskManager?
>>
>> > If, for example, you use 16 machines, then give each machine 4 task
>> > slots (total of 64 slots across all machines)
>> > That way, the final run (parallelism 64) will be guaranteed to be spread
>> > across all machines.
>>
>> My intention for the experiment at the moment is to try to scale the
>> application up on a single machine to its maximum before moving on to
>> run the experiment on multiple machines.
>>
>> Thank you again!
>> With best regards,
>> Shinhyung Yang
>>
>
>