Posted to user@flink.apache.org by "Marchant, Hayden " <ha...@citi.com> on 2017/12/07 12:53:35 UTC

Hardware Reference Architecture

Hi,

I'm looking for a hardware reference architecture for a small/medium Flink cluster. We'll be installing on in-house bare-metal servers. I'm looking for guidance on:

1. Number and spec of CPUs
2. RAM
3. Disks
4. Network
5. Proximity of servers to each other

(Most likely, we will choose YARN as a cluster manager for Flink)

If someone can share a document or link with relevant information, I will be very grateful.

Thanks,
Hayden Marchant


Re: Hardware Reference Architecture

Posted by Kostas Kloudas <k....@data-artisans.com>.
Hi Hayden,

This is a talk from Flink Forward that may be of help to you:
https://www.youtube.com/watch?v=8l8dCKMMWkw

and here are the slides:
http://www.slideshare.net/FlinkForward/flink-forward-berlin-2017-robert-metzger-keep-it-going-how-to-reliably-and-efficiently-operate-apache-flink/3

Kostas



Re: Hardware Reference Architecture

Posted by Kostas Kloudas <k....@data-artisans.com>.
Hi Hayden,

It would help if you could share a few more details about your use case and the load you expect,
as this would give us a better view of your needs.

As a general set of rules:
1) The bigger your cluster (in terms of total resources, not necessarily number of machines), the better.
2) The more RAM per machine the better, as this allows more data to fit in memory without spilling to disk.
3) Given the choice between a few powerful machines and many small ones, I would lean towards the former, as this
    keeps more communication local and allows for smaller network delays.

Once again, the above rules are just general recommendations, and more details about your workload will give us
more information to work with.
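
To make rule 3 a bit more concrete, here is a rough back-of-the-envelope sketch in Python. Everything in it is an assumption made for illustration: the two cluster layouts, the machine sizes, the "one task slot per core" convention, and the simplification that subtasks are spread evenly over machines during an all-to-all shuffle. It is not a Flink API and not a sizing recommendation; it only shows that two layouts with the same total resources can differ in how much data exchange stays on the same machine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterLayout:
    """Hypothetical cluster shape; all numbers are illustrative assumptions."""
    machines: int
    cores_per_machine: int
    ram_gb_per_machine: int

    @property
    def total_slots(self) -> int:
        # Assumption: one task slot per CPU core.
        return self.machines * self.cores_per_machine

    @property
    def ram_gb_per_slot(self) -> float:
        # RAM available to each slot if memory is split evenly.
        return self.ram_gb_per_machine / self.cores_per_machine

    @property
    def local_channel_fraction(self) -> float:
        # In an all-to-all shuffle with subtasks spread evenly over the
        # machines, each sender has 1/machines of its receivers on the
        # same machine, so that share of channels avoids the network.
        return 1.0 / self.machines


# Two layouts with identical total cores and total RAM.
few_large = ClusterLayout(machines=4, cores_per_machine=32, ram_gb_per_machine=256)
many_small = ClusterLayout(machines=32, cores_per_machine=4, ram_gb_per_machine=32)

for name, layout in (("few_large", few_large), ("many_small", many_small)):
    print(
        f"{name}: slots={layout.total_slots}, "
        f"ram_per_slot={layout.ram_gb_per_slot} GB, "
        f"local_fraction={layout.local_channel_fraction}"
    )
```

Under these assumptions both layouts offer the same number of slots and the same RAM per slot, but the layout with fewer, larger machines keeps a larger share of shuffle channels machine-local. Real workloads will differ, which is why the workload details matter.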

In the documentation here you can find some details about deployment, monitoring, etc.:
https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/yarn_setup.html#background--internals

I hope this helps,
Kostas
