Posted to users@kafka.apache.org by Ali Nazemian <al...@gmail.com> on 2018/07/04 13:57:37 UTC

Kafka disk recommendation for cloud

Hi All,

I was wondering what the recommended disk type is for hosting Kafka in a
cloud environment. As far as I know, most best practices suggest spinning
disks for Kafka because its architecture relies on sequential
writes/reads, so the performance gain from SSDs wouldn't be very
cost-effective. In a cloud environment, however, it might be a different
story because of hard limits on IOPS. On-prem, the average random IOPS of
a spinning disk is very low (about 100-200), but sequential IOPS can rise
to 20k-30k depending on various factors. In the cloud that is not the
case. For example, Azure limits every spinning disk to 500 IOPS, whether
random or sequential. Does that mean we can get at most 500 eps per disk?
And if so, would SSDs be the recommended choice for Kafka on cloud
providers?
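
A rough way to sanity-check the 500 eps figure is to work out what an IOPS
cap alone implies for different I/O sizes. The sketch below is only
arithmetic under stated assumptions: the I/O sizes and the 1 KiB event size
are made up for illustration, the function names are hypothetical, and it
ignores the separate throughput caps that cloud disks usually also have.

```python
def max_throughput_mib_per_s(iops_limit: int, io_size_kib: int) -> float:
    """Upper bound on disk throughput implied by an IOPS cap alone."""
    return iops_limit * io_size_kib / 1024


def max_events_per_s(iops_limit: int, io_size_kib: int, event_size_bytes: int) -> int:
    """Upper bound on events/s if every I/O is completely filled with events."""
    events_per_io = (io_size_kib * 1024) // event_size_bytes
    return iops_limit * events_per_io


IOPS_LIMIT = 500          # per-disk cap quoted in this message
EVENT_SIZE_BYTES = 1024   # assumed event size, illustration only

for io_size_kib in (4, 64, 256):  # assumed I/O sizes, illustration only
    mib = max_throughput_mib_per_s(IOPS_LIMIT, io_size_kib)
    eps = max_events_per_s(IOPS_LIMIT, io_size_kib, EVENT_SIZE_BYTES)
    print(f"{io_size_kib:>3} KiB per I/O: {mib:7.2f} MiB/s, up to {eps} events/s")
```

Under these assumptions the cap limits I/O operations, not events: one I/O
can carry many batched events, so 500 IOPS would only mean 500 eps if each
event needed its own disk operation.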

What about RAID0 vs JBOD for Kafka brokers? I have seen recommendations
for both, but I am not sure which one is preferred, especially in a cloud
environment.
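
To make the trade-off concrete: with JBOD each disk is listed as its own
directory in the broker's log.dirs setting and every partition lives
entirely on one disk, while RAID0 presents all disks as one volume. The
sketch below is a simplified model, not broker code. It assumes the broker
puts each new partition in the directory that currently holds the fewest
partitions, and the mount points, topic name and function names are
hypothetical.

```python
def place_partitions(log_dirs, partitions):
    """Assign each new partition to the directory with the fewest partitions."""
    counts = {d: 0 for d in log_dirs}
    placement = {}
    for partition in partitions:
        # Ties go to the first directory in the list.
        target = min(log_dirs, key=lambda d: counts[d])
        placement[partition] = target
        counts[target] += 1
    return placement


def lost_on_disk_failure(placement, failed_dir, raid0=False):
    """Local replicas lost when one physical disk fails."""
    if raid0:
        # A striped volume cannot survive the loss of any member disk.
        return sorted(placement)
    return sorted(p for p, d in placement.items() if d == failed_dir)


DISKS = ["/data/disk1", "/data/disk2", "/data/disk3"]  # hypothetical mount points
PARTITIONS = [f"events-{i}" for i in range(6)]         # hypothetical topic

jbod = place_partitions(DISKS, PARTITIONS)
print("JBOD loses: ", lost_on_disk_failure(jbod, "/data/disk2"))
print("RAID0 loses:", lost_on_disk_failure(jbod, "/data/disk2", raid0=True))
```

In this model a JBOD disk failure costs only the replicas on that disk,
whereas RAID0 costs every replica on the broker; in exchange, RAID0 spreads
each partition's I/O across all disks. Either way the lost replicas would
have to be re-replicated from other brokers.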

Regards,
Ali

Re: Kafka disk recommendation for cloud

Posted by Ali Nazemian <al...@gmail.com>.
Hi Dan,

Thanks for the reply. There aren't many publishers right now, but the
number is expected to grow. Messages are pretty small, but we write in
batches. By design Kafka relies on sequential reads/writes, but could SSD
still be a cost-effective option when there are many publishers?
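
To illustrate why the number of publishers matters even with batching, here
is a rough model of how many produce batches per second reach the brokers
for a fixed total message rate. It is a sketch under loud assumptions: each
publisher writes to a single partition, a batch is sent when it is full or
when the linger time expires, and the batch limit is expressed in messages
rather than bytes. The function and parameter names are hypothetical, and
batches per second is not the same thing as disk IOPS, since the broker
writes through the OS page cache.

```python
def batches_per_s(total_msgs_per_s: float, publishers: int,
                  max_batch_msgs: int, linger_s: float) -> float:
    """Approximate number of batches per second sent by all publishers."""
    per_publisher_rate = total_msgs_per_s / publishers
    # Messages that accumulate before the linger time expires, at least one
    # and at most a full batch.
    msgs_per_batch = min(max_batch_msgs, max(1.0, per_publisher_rate * linger_s))
    return publishers * per_publisher_rate / msgs_per_batch


TOTAL_RATE = 10_000   # assumed total messages per second
MAX_BATCH = 100       # assumed batch limit, in messages
LINGER_S = 0.01       # assumed linger time, in seconds

for n in (1, 10, 1000):  # assumed publisher counts
    rate = batches_per_s(TOTAL_RATE, n, MAX_BATCH, LINGER_S)
    print(f"{n:>5} publishers: about {rate:.0f} batches/s")
```

In this model the same total load arrives as far more, much smaller
batches once it is spread over many publishers, which is the "small writes
from many publishers" case.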

Cheers,
Ali

On Wed, 11 Jul. 2018, 04:48 Dan Rosanova, <da...@microsoft.com.invalid>
wrote:

> In Azure we recommend using managed disks for Kafka; HDInsight Kafka uses
> them. I generally see SSDs used for Kafka, but part of that may depend on
> whether you issue larger writes from fewer publishers or small writes
> from many publishers. What does your workload look like?
>
> Kind Regards,
> -Dan

RE: Kafka disk recommendation for cloud

Posted by Dan Rosanova <da...@microsoft.com.INVALID>.
In Azure we recommend using managed disks for Kafka; HDInsight Kafka uses them. I generally see SSDs used for Kafka, but part of that may depend on whether you issue larger writes from fewer publishers or small writes from many publishers. What does your workload look like?

Kind Regards,
-Dan
