Posted to users@kafka.apache.org by Ezra Stuetzel <ez...@gmail.com> on 2016/09/29 17:30:29 UTC

rack aware consumer

Hi,
In Kafka 0.10, is there a way to configure the consumer so that it is rack
aware? We replicate data across all our 'racks' and want consumers to
choose rack-local brokers whenever possible. Our configured racks
are actually in different datacenters, so the network cost of not
consuming from the nearest replica is much higher.

Configuring the consumer to consume only from specific hosts would also
achieve what we are trying to do. Is that possible?

Also, are there any major downsides to using the rack setting for
cross-datacenter replication?

Thanks,
Ezra

Re: rack aware consumer

Posted by Ian Wrigley <ia...@confluent.io>.
Unfortunately, that’s not the way Kafka works with respect to Consumers. When a partition is replicated, only one replica is the Leader, and all reads and writes go through the Leader. The other replicas are Followers; their only job is to keep up with the Leader. No read requests from Consumers go to Followers.

Ian.

---
Ian Wrigley
Director, Education Services
Confluent, Inc
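
To make the leader-only read path described above concrete, here is a minimal Python sketch. It is a toy model, not Kafka client code: the Replica and Partition classes, the fetch_target function, the broker ids, and the rack names dc-east and dc-west are all hypothetical and chosen only for illustration. It shows that, under this model, a consumer's own rack has no influence on which replica serves its reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    broker_id: int
    rack: str  # hypothetical rack / datacenter label


@dataclass(frozen=True)
class Partition:
    topic: str
    index: int
    leader: Replica
    followers: tuple  # replicas that only copy data from the leader


def fetch_target(partition, consumer_rack):
    """Return the replica a consumer reads from.

    In the leader-only model, consumer_rack is deliberately ignored:
    every read goes to the partition leader.
    """
    return partition.leader


def cross_rack_partitions(partitions, consumer_rack):
    """List the partitions whose reads must leave the consumer's rack."""
    return [
        p for p in partitions
        if fetch_target(p, consumer_rack).rack != consumer_rack
    ]


# Hypothetical three-broker cluster stretched over two datacenters.
b1 = Replica(1, "dc-east")
b2 = Replica(2, "dc-west")
b3 = Replica(3, "dc-east")

partitions = [
    Partition("events", 0, leader=b1, followers=(b2,)),
    Partition("events", 1, leader=b2, followers=(b3,)),
    Partition("events", 2, leader=b2, followers=(b1,)),
]

remote = cross_rack_partitions(partitions, "dc-east")
print([p.index for p in remote])  # prints [1, 2]
```

In this sketch, partitions 1 and 2 each have a replica in dc-east, yet a dc-east consumer still reads both from the dc-west leader. That is the cross-datacenter cost the original question is trying to avoid.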

> On Sep 30, 2016, at 12:32 PM, Ezra Stuetzel <ez...@gmail.com> wrote:
> 
> Hi,
> Yeah I am aware of MirrorMaker. We tried to simplify our architecture so as
> to avoid needing to use MirrorMaker and just rely on the rack replication
> for cross datacenter replication. I think the only missing piece to this is
> making consumers only read from a subset of the nodes in the cluster,
> specifically the rack/datacenter local nodes.
> Thanks,
> Ezra

Re: rack aware consumer

Posted by Ezra Stuetzel <ez...@gmail.com>.
Hi,
Yeah I am aware of MirrorMaker. We tried to simplify our architecture so as
to avoid needing to use MirrorMaker and just rely on the rack replication
for cross datacenter replication. I think the only missing piece to this is
making consumers only read from a subset of the nodes in the cluster,
specifically the rack/datacenter local nodes.
Thanks,
Ezra


On Fri, Sep 30, 2016 at 8:03 AM, Marko Bonaći <ma...@sematext.com>
wrote:

> AFAIK (I'm not actually using it myself), for cross-DC replication people
> tend to use MirrorMaker to transfer one cluster's data to another, usually
> to a kind of central DC that unifies all the "regional" DCs, but the layout
> depends on your business requirements.
> Your consumers are then given only the local brokers' addresses. This is
> done for exactly the reasons you mentioned: the high latency of consuming
> from a remote broker, and not being able to control which broker becomes
> the leader if the current leader fails, since that is governed by the rule
> that says the most up-to-date in-sync replica becomes the leader.

Re: rack aware consumer

Posted by Marko Bonaći <ma...@sematext.com>.
AFAIK (I'm not actually using it myself), for cross-DC replication people
tend to use MirrorMaker to transfer one cluster's data to another, usually
to a kind of central DC that unifies all the "regional" DCs, but the layout
depends on your business requirements.
Your consumers are then given only the local brokers' addresses. This is
done for exactly the reasons you mentioned: the high latency of consuming
from a remote broker, and not being able to control which broker becomes
the leader if the current leader fails, since that is governed by the rule
that says the most up-to-date in-sync replica becomes the leader.


Marko Bonaći
Monitoring | Alerting | Anomaly Detection | Centralized Log Management
Solr & Elasticsearch Support
Sematext <http://sematext.com/> | Contact
<http://sematext.com/about/contact.html>
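
As a rough illustration of "consumers are given only the local brokers' addresses", the sketch below builds a consumer configuration per datacenter. It assumes the layout described above, with a separate Kafka cluster in each datacenter kept in sync by MirrorMaker. The keys bootstrap.servers and group.id are standard Kafka consumer settings; the hostnames, the LOCAL_CLUSTERS table, and the consumer_config helper are hypothetical.

```python
# Hypothetical mapping from datacenter to the brokers of its own cluster.
LOCAL_CLUSTERS = {
    "dc-east": ["kafka-east-1.example.com:9092", "kafka-east-2.example.com:9092"],
    "dc-west": ["kafka-west-1.example.com:9092", "kafka-west-2.example.com:9092"],
}


def consumer_config(datacenter, group_id):
    """Build a consumer config that points only at the local cluster."""
    try:
        brokers = LOCAL_CLUSTERS[datacenter]
    except KeyError:
        raise ValueError(f"no Kafka cluster configured for {datacenter!r}") from None
    return {
        "bootstrap.servers": ",".join(brokers),
        "group.id": group_id,
    }


config = consumer_config("dc-east", "billing-readers")
print(config["bootstrap.servers"])
```

Note that bootstrap.servers is only used for initial cluster discovery. After that, the consumer connects to whichever brokers lead its partitions, so this approach keeps traffic local only when each datacenter runs its own cluster. In a single cluster stretched across datacenters, listing local brokers here would not restrict where reads go.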
