Posted to users@kafka.apache.org by Jason Rosenberg <jb...@squareup.com> on 2013/05/17 23:29:39 UTC

heterogenous kafka cluster?

Hi,

I'm wondering if there's a good way to have a heterogenous kafka cluster
(specifically, if we have nodes with different sized disks).  So, we might
want a larger node to receive more messages than a smaller node, etc.

I expect there's something we can do with using a partitioner that has
specific knowledge about the hosts in the cluster, but this feels messy, to
have this config on every producer client....

Thoughts?

Jason
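
For illustration, the producer-side partitioner idea could look roughly like the sketch below. It is only a sketch under stated assumptions: the partition-to-leader map and the per-broker weights are hypothetical inputs that every producer would have to be configured with (which is exactly the messy part), and `build_table` and `choose_partition` are made-up names, not part of any Kafka client API.

```python
import bisect
import hashlib
from itertools import accumulate

# Hypothetical inputs: which broker leads each partition, and a relative
# weight per broker (e.g. proportional to disk size). Neither comes from Kafka.
PARTITION_LEADER = {0: "small-1", 1: "small-2", 2: "big-1"}
BROKER_WEIGHT = {"small-1": 1.0, "small-2": 1.0, "big-1": 2.0}


def build_table(partition_leader, broker_weight):
    """Return (partitions, cumulative_weights) for weighted selection.

    A broker's weight is split evenly across the partitions it leads, so the
    broker as a whole receives traffic in proportion to its weight.
    """
    led = {}
    for leader in partition_leader.values():
        led[leader] = led.get(leader, 0) + 1
    partitions = sorted(partition_leader)
    weights = [broker_weight[partition_leader[p]] / led[partition_leader[p]]
               for p in partitions]
    return partitions, list(accumulate(weights))


def choose_partition(key, partitions, cumulative):
    """Map a message key to a partition, deterministically, by weight."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # Use 53 hash bits to get a point in [0, total_weight).
    unit = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    point = unit * cumulative[-1]
    # Clamp guards against floating-point rounding at the upper edge.
    index = min(bisect.bisect_right(cumulative, point), len(partitions) - 1)
    return partitions[index]


partitions, cumulative = build_table(PARTITION_LEADER, BROKER_WEIGHT)
print(choose_partition("order-42", partitions, cumulative))
```

With these example weights, roughly half of all keys should land on the partition led by `big-1`. The table goes stale whenever partition leadership changes, which is another reason to prefer a broker-side solution.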

Re: heterogenous kafka cluster?

Posted by Jason Rosenberg <jb...@squareup.com>.
Yeah,

I thought of that (running 2 kafkas on one box), but it doesn't really add
the benefit of redundancy through replication (e.g. if we have 2 replicas
mapping to the same physical machine).

Jason
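
To make the redundancy concern concrete, here is a small sketch that flags partitions whose replicas end up on the same physical machine. The broker-to-host map and the example assignment are hypothetical, and `colocated_partitions` is an illustrative helper, not something Kafka provides.

```python
# Hypothetical broker-id -> physical-host map; two broker processes (ids 3
# and 4) share the larger machine, as in the suggestion being discussed.
BROKER_HOST = {1: "host-a", 2: "host-b", 3: "host-big", 4: "host-big"}


def colocated_partitions(assignment, broker_host):
    """Return partitions whose replicas do not all sit on distinct hosts.

    `assignment` maps partition id -> list of broker ids holding a replica.
    """
    bad = []
    for partition, replicas in sorted(assignment.items()):
        hosts = [broker_host[b] for b in replicas]
        if len(set(hosts)) < len(hosts):
            bad.append(partition)
    return bad


# Hypothetical assignment with replication factor 2.
assignment = {0: [1, 3], 1: [3, 4], 2: [2, 4]}
print(colocated_partitions(assignment, BROKER_HOST))  # [1]
```

Partition 1 has both replicas on `host-big`, so losing that one machine would take out every copy of it.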


On Fri, May 17, 2013 at 2:50 PM, Chris Riccomini <cr...@linkedin.com> wrote:

> Hey guys,
>
> I have no idea if this would be reasonable, but what about just running
> two Kafka processes on the bigger box?
>
> Cheers,
> Chris

Re: heterogenous kafka cluster?

Posted by Chris Riccomini <cr...@linkedin.com>.
Hey guys,

I have no idea if this would be reasonable, but what about just running
two Kafka processes on the bigger box?

Cheers,
Chris

On 5/17/13 2:48 PM, "Jason Rosenberg" <jb...@squareup.com> wrote:

>Just resource allocation issues.  E.g. imagine having an existing kafka
>cluster with one machine spec, and getting access to a few more hosts to
>augment the cluster, which are newer and therefore have twice the disk
>storage.  I'd like to seamlessly add them into the cluster, without having
>to replace everything en masse.  Thus, it would be nice for the newer ones
>to take proportionally more load based on the relative storage available,
>etc.
>
>Jason


Re: heterogenous kafka cluster?

Posted by Jason Rosenberg <jb...@squareup.com>.
Just resource allocation issues.  E.g. imagine having an existing kafka
cluster with one machine spec, and getting access to a few more hosts to
augment the cluster, which are newer and therefore have twice the disk
storage.  I'd like to seamlessly add them into the cluster, without having
to replace everything en masse.  Thus, it would be nice for the newer ones
to take proportionally more load based on the relative storage available,
etc.

Jason


On Fri, May 17, 2013 at 2:34 PM, Neha Narkhede <ne...@gmail.com> wrote:

> That does seem a little hacky. But I'm trying to understand the requirement
> behind having to deploy heterogeneous hardware. What are you trying to
> achieve or optimize?
>
> Thanks,
> Neha

Re: heterogenous kafka cluster?

Posted by Neha Narkhede <ne...@gmail.com>.
That does seem a little hacky. But I'm trying to understand the requirement
behind having to deploy heterogeneous hardware. What are you trying to
achieve or optimize?

Thanks,
Neha


On Fri, May 17, 2013 at 2:29 PM, Jason Rosenberg <jb...@squareup.com> wrote:

> Hi,
>
> I'm wondering if there's a good way to have a heterogenous kafka cluster
> (specifically, if we have nodes with different sized disks).  So, we might
> want a larger node to receive more messages than a smaller node, etc.
>
> I expect there's something we can do with using a partitioner that has
> specific knowledge about the hosts in the cluster, but this feels messy, to
> have this config on every producer client....
>
> Thoughts?
>
> Jason
>

Re: heterogenous kafka cluster?

Posted by Maxime Brugidou <ma...@gmail.com>.
Have you thought about integrating Kafka into a distributed resource
management framework like Hadoop YARN (which would probably leverage HDFS)
or Mesos?

On May 23, 2013 11:31 PM, "Neha Narkhede" <ne...@gmail.com> wrote:

> This paper talks about how to do that -
> http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf
> It will be interesting to see what part of it Kafka can adopt, if any.
>
> Thanks,
> Neha

Re: heterogenous kafka cluster?

Posted by Neha Narkhede <ne...@gmail.com>.
This paper talks about how to do that -
http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf
It will be interesting to see what part of it Kafka can adopt, if any.

Thanks,
Neha
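
As a rough illustration of weight-aware placement, the sketch below uses weighted rendezvous (highest-random-weight) hashing. To be clear, this is not the algorithm from the linked paper (which I believe is the CRUSH paper) and not anything Kafka implements; it is a simpler, swapped-in technique that shares the key property that a node is chosen in proportion to its weight, without a central lookup table. The broker names, weights, and function names are all hypothetical.

```python
import hashlib
import math


def _unit_hash(partition_key, broker):
    """Hash (partition_key, broker) to a float strictly inside (0, 1)."""
    data = f"{partition_key}|{broker}".encode("utf-8")
    bits = int.from_bytes(hashlib.sha256(data).digest()[:8], "big") >> 11
    # bits is in [0, 2**53); shifting by one keeps the result off 0 and 1.
    return (bits + 1) / (2**53 + 1)


def place_replicas(partition_key, broker_weight, replication_factor):
    """Pick `replication_factor` distinct brokers, favouring heavier ones.

    Each broker gets the score -weight / ln(u), with u a per-(key, broker)
    hash in (0, 1); the highest scores win.
    """
    def score(broker):
        return -broker_weight[broker] / math.log(_unit_hash(partition_key, broker))

    ranked = sorted(broker_weight, key=score, reverse=True)
    return ranked[:replication_factor]


# Hypothetical weights: the newer broker has twice the disk.
weights = {"broker-1": 1.0, "broker-2": 1.0, "broker-3": 2.0}
print(place_replicas("my-topic-0", weights, 2))
```

With these weights, `broker-3` should be the first choice for about half of all partitions. Adding or removing a broker only moves the partitions whose top-ranked brokers change.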


On Fri, May 17, 2013 at 11:28 PM, Jason Rosenberg <jb...@squareup.com> wrote:

> Letting each broker have a weight sounds like a great idea.
>
> Since in my use case, topics are generally auto-created, it won't be
> practical to map brokers manually per topic.
>
> Thanks,
>
> Jason

Re: heterogenous kafka cluster?

Posted by Jason Rosenberg <jb...@squareup.com>.
Letting each broker have a weight sounds like a great idea.

Since in my use case, topics are generally auto-created, it won't be
practical to map brokers manually per topic.

Thanks,

Jason


On Fri, May 17, 2013 at 8:38 PM, Jun Rao <ju...@gmail.com> wrote:

> In 0.8, you can create topics manually and explicitly specify the replica
> to broker mapping. Post 0.8, we can think of some more automated ways to
> deal with this (e.g., let each broker carry some kind of weight).
>
> Thanks,
>
> Jun

Re: heterogenous kafka cluster?

Posted by Jun Rao <ju...@gmail.com>.
In 0.8, you can create topics manually and explicitly specify the replica
to broker mapping. Post 0.8, we can think of some more automated ways to
deal with this (e.g., let each broker carry some kind of weight).

Thanks,

Jun
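
One way the two ideas could be combined, sketched under assumptions: derive an explicit replica-to-broker mapping from per-broker weights, then pass it to the manual topic-creation step. The weights, the greedy least-relative-load rule, and both function names are illustrative only. The output format (partitions separated by commas, replica broker ids separated by colons) is what I understand the manual assignment option of the topic tool to accept, so please check it against your Kafka version.

```python
def weighted_assignment(broker_weight, num_partitions, replication_factor):
    """Greedily assign replicas so each broker's count tracks its weight.

    For every partition, pick the brokers that would be least loaded relative
    to their weight after receiving one more replica (broker id breaks ties).
    """
    if replication_factor > len(broker_weight):
        raise ValueError("replication factor exceeds number of brokers")
    load = {broker: 0 for broker in broker_weight}
    assignment = []
    for _ in range(num_partitions):
        ranked = sorted(load, key=lambda b: ((load[b] + 1) / broker_weight[b], b))
        chosen = ranked[:replication_factor]
        for broker in chosen:
            load[broker] += 1
        assignment.append(chosen)
    return assignment


def format_assignment(assignment):
    """Render as 'b1:b2,b3:b4,...': one comma-separated entry per partition."""
    return ",".join(":".join(str(b) for b in replicas) for replicas in assignment)


# Hypothetical cluster: broker 3 has twice the disk of brokers 1 and 2.
weights = {1: 1.0, 2: 1.0, 3: 2.0}
mapping = weighted_assignment(weights, num_partitions=4, replication_factor=2)
print(format_assignment(mapping))  # 3:1,2:3,3:1,2:3
```

Here broker 3 holds 4 of the 8 replicas and brokers 1 and 2 hold 2 each, matching the 2:1:1 weights. As noted elsewhere in the thread, this only helps when topics are created manually rather than auto-created.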


On Fri, May 17, 2013 at 2:29 PM, Jason Rosenberg <jb...@squareup.com> wrote:

> Hi,
>
> I'm wondering if there's a good way to have a heterogenous kafka cluster
> (specifically, if we have nodes with different sized disks).  So, we might
> want a larger node to receive more messages than a smaller node, etc.
>
> I expect there's something we can do with using a partitioner that has
> specific knowledge about the hosts in the cluster, but this feels messy, to
> have this config on every producer client....
>
> Thoughts?
>
> Jason
>