Posted to dev@samza.apache.org by Ankit Malhotra <am...@appnexus.com> on 2017/01/30 22:50:34 UTC
Hardware configurations for samza clusters
Hi - I am curious whether people can share the hardware configurations on which they have been running Samza. We are evaluating Samza for a streaming join use case that makes heavy use of RocksDB, where the store spills to disk for most joins. Specifically, I am interested in the number of cores, the amount of memory, the number and type of SSDs, RAID configurations, etc.
Thanks
Ankit
Re: Hardware configurations for samza clusters
Posted by Jagadish Venkatraman <ja...@gmail.com>.
When using Samza to process streaming data (Kafka/Databus), we deploy to
YARN clusters dedicated to Samza workloads. The machines in these clusters
are configured roughly like the one I described earlier.
When using Samza to process batch data (files on Hadoop
<https://reviews.apache.org/r/52570/>), we deploy to our Hadoop clusters,
which are shared with other MapReduce workloads. I believe these clusters use
spinning disks.
Going forward, we plan to explore the trade-offs between storage cost and
performance, and will continue to share what we learn with the community.
Thanks,
Jagadish
On Tue, Jan 31, 2017 at 1:38 PM, Ankit Malhotra <am...@appnexus.com>
wrote:
> Hi Jagadish,
>
> Thanks for your reply. Is it safe to assume that you are running similar
> machines in production YARN clusters where only SAMZA workloads run?
>
> Ankit
>
> > On Jan 31, 2017, at 3:49 PM, Jagadish Venkatraman <jagadish1989@gmail.com> wrote:
> >
> > Hi Ankit,
> >
> > We have benchmarked Samza on the following hardware configuration:
> >
> > - Processor: Intel Xeon 2.67 GHz processor (with 24 cores)
> > - 48GB of RAM
> > - 1Gbps Ethernet
> > - SSD: 1.65TB Fusion-IO SSD
> >
> > Please check out the perf numbers and the methodology here:
> > https://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node
> >
> > Thanks,
>
>
--
Jagadish V,
Graduate Student,
Department of Computer Science,
Stanford University
Re: Hardware configurations for samza clusters
Posted by Ankit Malhotra <am...@appnexus.com>.
Hi Jagadish,
Thanks for your reply. Is it safe to assume that you run similar machines in your production YARN clusters, where only Samza workloads run?
Ankit
> On Jan 31, 2017, at 3:49 PM, Jagadish Venkatraman <ja...@gmail.com> wrote:
>
> Hi Ankit,
>
> We have benchmarked Samza on the following hardware configuration:
>
> - Processor: Intel Xeon 2.67 GHz processor (with 24 cores)
> - 48GB of RAM
> - 1Gbps Ethernet
> - SSD: 1.65TB Fusion-IO SSD
>
> Please check out the perf numbers and the methodology here:
> https://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node
>
> Thanks,
Re: Hardware configurations for samza clusters
Posted by Jagadish Venkatraman <ja...@gmail.com>.
Hi Ankit,
We have benchmarked Samza on the following hardware configuration:
- Processor: Intel Xeon, 2.67 GHz (24 cores)
- 48GB of RAM
- 1Gbps Ethernet
- SSD: 1.65TB Fusion-IO SSD
Please check out the perf numbers and the methodology here:
https://engineering.linkedin.com/performance/benchmarking-apache-samza-12-million-messages-second-single-node
Thanks,
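For anyone sizing a similar node, here is a rough back-of-the-envelope sketch (not from the thread) of how many Samza containers might fit on a machine like the one above. The node figures (24 cores, 48 GB RAM, 1.65 TB SSD, taken as 1650 GB) come from the benchmark configuration. The per-container figures and the reserved headroom are purely hypothetical assumptions for illustration and would need to be measured for a real join workload.

```python
def containers_per_node(node_cores, node_mem_gb, node_disk_gb,
                        container_cores, container_mem_gb, container_disk_gb,
                        reserved_cores=2, reserved_mem_gb=8):
    """Estimate how many containers fit on one node.

    The result is bounded by whichever resource runs out first: CPU, memory,
    or local disk for the RocksDB stores. The reserved_* defaults are
    assumed headroom for the OS and NodeManager, not recommended values.
    """
    usable_cores = node_cores - reserved_cores
    usable_mem_gb = node_mem_gb - reserved_mem_gb
    if usable_cores <= 0 or usable_mem_gb <= 0:
        return 0
    by_cpu = usable_cores // container_cores
    by_mem = usable_mem_gb // container_mem_gb
    by_disk = node_disk_gb // container_disk_gb
    return min(by_cpu, by_mem, by_disk)


if __name__ == "__main__":
    # Node figures from the benchmark machine; container figures are hypothetical.
    count = containers_per_node(
        node_cores=24, node_mem_gb=48, node_disk_gb=1650,
        container_cores=2, container_mem_gb=4, container_disk_gb=100,
    )
    print("containers per node:", count)
```

With these assumed container sizes, memory is the limiting resource. Raising the per-container store size to 200 GB would make disk the bottleneck instead, which is the kind of trade-off a join that spills heavily to RocksDB is likely to hit.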