Posted to users@kafka.apache.org by xiaoyu wang <xi...@gmail.com> on 2012/04/18 20:35:35 UTC

can I start a kafka server that mirrors more than one cluster?

Hello,

I am trying out Kafka mirroring. It seems to me that a Kafka server (broker)
can mirror only one cluster. So in order to mirror N source clusters to one
aggregate cluster, I will need to start N brokers, each mirroring one source
cluster. Is this correct?

Also, is there a plan to enable non-random partitioning in the built-in
producer for mirroring?


Thanks,

-Xiaoyu

Re: can I start a kafka server that mirrors more than one cluster?

Posted by Jun Rao <ju...@gmail.com>.
Yes, that's a good point. I don't have a good solution other than creating
a customized version of mirror maker.

Thanks,

Jun



Re: can I start a kafka server that mirrors more than one cluster?

Posted by Joel Koshy <jj...@gmail.com>.
Yes, but the mirroring code instantiates a producer directly and needs to
know the key's type. Same for the ProducerData - its type parameters and the
key itself need to be specified in the code, so we can't use arbitrary
configuration-driven key types. Or is there an easy work-around that you
have in mind?

Joel


Re: can I start a kafka server that mirrors more than one cluster?

Posted by Jun Rao <ju...@gmail.com>.
The producer actually supports partitioning by a partitioning key. So you just
need to select a partitioner (the default partitions based on the hash value
of the key) and put a key in ProducerData.
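
As a rough illustration of key-based partitioning, the sketch below maps a key
to a partition by hashing it and taking the result modulo the partition count.
It is written in Python rather than against the Kafka producer API, and the
function name hash_partition and the use of CRC32 are assumptions made for the
example, not what Kafka's default partitioner does internally.

```python
import zlib


def hash_partition(key: str, num_partitions: int) -> int:
    """Map a key to a partition index in [0, num_partitions)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is used only because it is stable across runs and machines;
    # it is an assumption for this sketch, not Kafka's own hash function.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


if __name__ == "__main__":
    # The same key always lands on the same partition.
    for key in ["user-1", "user-2", "user-1"]:
        print(key, "->", hash_partition(key, 4))
```

The point of the sketch is only that the mapping is deterministic: every
message carrying the same key goes to the same partition.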

Thanks,

Jun


Re: can I start a kafka server that mirrors more than one cluster?

Posted by xiaoyu wang <xi...@gmail.com>.
Thanks, Joel, I filed a JIRA:

https://issues.apache.org/jira/browse/KAFKA-333


Re: can I start a kafka server that mirrors more than one cluster?

Posted by Joel Koshy <jj...@gmail.com>.
> I am trying out Kafka mirroring. It seems to me that a Kafka server (broker)
> can mirror only one cluster. So in order to mirror N source clusters to one
> aggregate cluster, I will need to start N brokers, each mirroring one source
> cluster. Is this correct?
>

That's right. Also, you may want to take a look at the new mirroring
mechanism which will help address your scenario. It was recently added (in
trunk) so there's not much by way of documentation apart from the code
itself. It is pretty simple:
- Start up your mirror cluster
- Run kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config
cluster1_consumer.properties --consumer.config cluster2_consumer.properties
--producer.config mirror_producer.properties --whitelist=".*"
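
To make the multi-cluster case concrete, here is a small Python sketch that
assembles the command above for any number of source clusters, adding one
--consumer.config per cluster. The helper name mirror_maker_command is
hypothetical; the only flags it emits are the ones shown in the command above.

```python
import shlex
from typing import List


def mirror_maker_command(consumer_configs: List[str],
                         producer_config: str,
                         whitelist: str = ".*") -> List[str]:
    """Build the MirrorMaker argument list for one or more source clusters."""
    if not consumer_configs:
        raise ValueError("at least one source cluster consumer config is required")
    cmd = ["kafka-run-class.sh", "kafka.tools.MirrorMaker"]
    # One --consumer.config per source cluster to mirror.
    for path in consumer_configs:
        cmd += ["--consumer.config", path]
    # A single producer config points at the target (aggregate) cluster.
    cmd += ["--producer.config", producer_config, f"--whitelist={whitelist}"]
    return cmd


if __name__ == "__main__":
    args = mirror_maker_command(
        ["cluster1_consumer.properties", "cluster2_consumer.properties"],
        "mirror_producer.properties",
    )
    # Print a shell-quoted form that could be pasted into a terminal.
    print(" ".join(shlex.quote(a) for a in args))
```

Mirroring a third source cluster is then just a matter of adding a third
consumer properties file to the list.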

I'll update that mirroring wiki soon, since the embedded consumer approach
is being deprecated in favor of the stand-alone mirroring tool.


> Also, is there a plan to enable non-random partitioning in the built-in
> producer for mirroring?
>
>
This would be a good feature to have - can you file a jira to track it? I'd
have to think through how best to implement it (right now the partitioning
key type needs to be embedded in the code. The mirroring tool instantiates
a producer directly and assumes random partitioning.)
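
To illustrate what random partitioning means for keyed data, the following
Python sketch assigns each mirrored message to a uniformly random partition
and records where each key ends up. The function names and the fixed seed are
assumptions for the example; this is not the mirroring tool's code.

```python
import random
from typing import Dict, Iterable, Set, Tuple


def random_partition(num_partitions: int, rng: random.Random) -> int:
    """Pick a partition uniformly at random, ignoring any message key."""
    return rng.randrange(num_partitions)


def mirror_with_random_partitioning(
    messages: Iterable[Tuple[str, object]],
    num_partitions: int,
    seed: int = 0,
) -> Dict[str, Set[int]]:
    """Return, for each key, the set of target partitions its messages hit."""
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    placement: Dict[str, Set[int]] = {}
    for key, _payload in messages:
        # The key is only used for bookkeeping; it does not affect placement.
        placement.setdefault(key, set()).add(random_partition(num_partitions, rng))
    return placement


if __name__ == "__main__":
    msgs = [("user-1", i) for i in range(100)]
    print(mirror_with_random_partitioning(msgs, num_partitions=4))
```

With enough messages per key, a key's messages will typically be spread over
several target partitions, so any key-to-partition grouping in the source
cluster is not preserved in the mirror.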

Thanks,

Joel